DOCUMENT RESUME

ED 125 50a                                        HE 008 C99

AUTHOR          Milton, Ohmer; Edgerly, John W.
TITLE           The Testing and Grading of Students.
INSTITUTION     Change Magazine, New Rochelle, N.Y.
SPONS AGENCY    Ford Foundation, New York, N.Y.
PUB DATE        76
AVAILABLE FROM  Change Magazine, NBW Tower, New Rochelle, New York
                10801 ($2.95)

EDRS PRICE      MF-$0.83 Plus Postage. HC Not Available from EDRS.
DESCRIPTORS     *Achievement Rating; *Achievement Tests;
                Bibliographies; Cognitive Objectives; Essay Tests;
                *Grades (Scholastic); *Grading; *Higher Education;
                Learning Motivation; Measurement Goals; Measurement
                Instruments; Test Construction; Test Reliability;
                Test Results; *Test Validity

ABSTRACT

Although over 100 million tests are administered each year and
testing is a subject of increasing contention among students, faculty
members remain diffident. A better understanding of the purpose and
structure of evaluating mechanisms is a prerequisite for widespread
improvement. Teachers must understand what factors play a part in
measuring learning. If learning goals and course objectives are
properly defined, they will be essential ingredients of success for
student and teacher alike. Since multiple-choice and essay tests are
most commonly used in college today, a thorough analysis of their
structure and purpose is undertaken to clarify underlying principles
of evaluation as a learning tool. Letter grading, the most commonly
accepted form of evaluation, is particularly susceptible to the charge
of insufficient feedback to the student. A more fundamental grasp of
the options for academic measurement is the most direct route to
improved grading. Growing external pressures are forcing faculty to
reexamine student evaluation. The use of external examiners and the
establishment of effective campus grievance arrangements are only two
of the ways recommended to improve an increasingly bothersome issue in
academic life. (IBH)




A Change Publication




The Testing and Grading
of Students







By Ohmer Milton and John W. Edgerly 






The Change Policy Papers



1. Faculty Development in a 
Time of Retrenchment 

2. Colleges and Money 

3. The Testing and Grading of Students 







Copyright 1976 by Change Magazine and Educational Change
All rights reserved

Printed in the United States of America
LC 76-203

ISBN 0-915390-05-1



The Testing and Grading of Students is one of a series of
policy papers to help American faculty become more effec-
tive professionals. This volume has been published under
a grant from The Ford Foundation.



About This Special Report

When viewed against the current crescendo of egalitarian sentiments,
or at least egalitarian rhetoric, the subject of student testing, let
alone its design, leads one to some fascinating formulations about the
very essence of an education. But no matter what one's ideological
stance, the point remains that the "world outside" metes out rewards
and penalties pretty much according to one's competence and talents.
This being the current state of affairs, the least one can hope for is
a less diffident effort by academics everywhere to evaluate student
performance according to the fairest criteria available. There is now
abundant evidence, and not only gleaned from student dissatisfaction,
that testing and grading are often dispensed with an arbitrariness
worthy of a Kublai Khan.

An appropriate design for tests should confirm the essential
understanding between faculty and students as to what should be
learned and what should not. This achievement is difficult because it
means, as Yale professor A. Bartlett Giamatti has put it, "deciding
that it is in fact a limited world, that some things are more
important than others, that adjustments realistically have to be made.
It means deciding that you really know what it is you want to teach."

It is this centrality to the learning process that makes the subject
of Change's third policy paper, on testing and grading, [...]

The editors of Change are particularly grateful to the paper's
authors, Ohmer Milton and John Edgerly, for their clearheaded
portrayal of what is by any measure a vexing and complex subject. The
authors, of the Learning Research Center and the Counseling Center at
the University of Tennessee at Knoxville, respectively, are widely
regarded as sensitive authorities on the subject of human assessment.
The preparation of the manuscript was facilitated by funds from the
American Psychological Foundation and the University of Tennessee.
This Change publication has been made possible under a separate Ford
Foundation grant, which we acknowledge with thanks.

A number of individuals and organizations have contributed to the
final formulation of this policy paper. The authors and editors wish
to thank them for their counsel on a difficult and much debated





subject. They include John Bevan, College of Charleston; Kenneth
Eble, University of Utah; John Gillis, Chapman College; Linda Kahan,
Evergreen State College; Lee McDonald, Pomona College; Robert
O'Neil, Indiana University at Bloomington; Robert Van Waes,
American Association of University Professors (AAUP); Francis J.
Wuest, Association of American Colleges (AAC); and Norman
Frederiksen and Paul Diederich of the Educational Testing Service.
Prior to publication, the AAUP and AAC endorsed this policy paper
for its serviceability.

There is, on the editors' side, only one further wish: that this
publication will be studied by thousands of faculty with as much
care as was put into its preparation. One need not agree with every
nuance and every thought expressed here. One need only be open to
the possibility for learning much about the neglected subject of
student evaluation. Here is as good a starting point as any to bring
rational planning into what remain, surprisingly, still rather
uncharted waters of academic life.

George W. Bonham 
January 1976 






The Testing and Grading of Students:
Why, Where, and How?



1

The Malignancy of Testing

Throughout American higher education, over 100 million tests are
administered each year. Although testing is a subject of increasing
contention among students, faculty members remain diffident. But a
better understanding of both the purpose and structure of evaluating
mechanisms becomes a prerequisite for widespread improvement.



2

Setting Learning Goals

Teachers must understand what factors play a part in the measurement
of learning. Faculty who pay ample attention to course content are
often vague about the process of evaluation. If learning goals and
course objectives are properly defined, they will be essential
ingredients of success for student and teacher alike.



3

Constructing Tests

Faculty widely confuse concepts of measurement and student
evaluation. Regardless of the test construction chosen, both concepts
must be carefully kept in mind. Multiple-choice and essay tests are
most commonly used in college today. A thorough analysis of their
structure and purpose clarifies underlying principles of evaluation as
a learning tool.



4

Grading

A comprehensive evaluation of student performance should provide
guidance for academic improvement, but students too often receive
scant critical commentary on their progress. Letter grading, the most
commonly accepted form of evaluation, is particularly susceptible to
the charge of insufficient feedback to the student. A more fundamental
grasp of the options for academic measurement is the most direct route
to improved grading.



5

Lone Efforts Are Not Enough

Growing external pressures are forcing faculty to take a fresh look at
student evaluation. The new consumerism, recent legal decisions, and
far-reaching social criticism will no longer leave matters of grading
and testing to the private academic preserve. The use of external
examiners and the establishment of effective campus grievance
arrangements are only two of the ways recommended to improve an
increasingly nettlesome issue in academic life.



6

For Further Reading

For a more comprehensive understanding of testing and evaluation,
faculty have access to a number of excellent source documents. Here
are some of the best.



The Malignancy of Testing



Throughout American higher education, over 100 million
tests are administered each year. Although testing is a
subject of increasing contention among students, faculty
members remain diffident. But a better understanding of
both the purpose and structure of evaluating mechanisms
becomes a prerequisite for widespread improvement.






There are slightly more than half a million faculty members in
American colleges and universities. If each teaches an average of two
courses and prepares three tests for each course, at least three
million exams are given during any quarter or semester. Since these
examinations are administered to about 10 million students, about 30
million tests are given every three or four months, or over 100
million every academic year. This is measurement on a grand scale
indeed!

Considering that major decisions are made about students' lives
(whether they remain in school, enter professional or graduate
institutions, secure jobs) partially on the basis of those haloed test
statistics, the grade point averages, elaborate care should be
required in the entire testing and grading enterprise. Unfortunately,
the very terms "testing" and "grading" have come to be used more or
less synonymously, with either one referring to the entire process. In
this policy paper each term will be used in a restricted and
distinctive sense. Here, "testing" means measurement; "grading" means
assigning an evaluative symbol: A, B, C, D, F (see Chapter 3).
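The estimate of over 100 million tests a year that opens this chapter
can be retraced with a few multiplications. The Python sketch below is
only an illustration. The faculty, course, test, and student figures
come from the text; the figure of three tests sat per student per term
is an assumption introduced here to reproduce the quoted 30 million,
and all variable names are hypothetical.

```python
# Back-of-envelope estimate of tests administered per year,
# following the figures quoted in the text.

FACULTY = 500_000          # "slightly more than half a million", rounded down
COURSES_PER_FACULTY = 2    # average number of courses taught, per the text
TESTS_PER_COURSE = 3       # tests prepared for each course, per the text

# Distinct exams prepared in any one quarter or semester.
exams_prepared_per_term = FACULTY * COURSES_PER_FACULTY * TESTS_PER_COURSE

STUDENTS = 10_000_000      # "about 10 million students"
# ASSUMPTION (not stated in the text): each student sits about three
# tests per term, which reproduces the quoted 30 million figure.
TESTS_SAT_PER_STUDENT = 3
tests_taken_per_term = STUDENTS * TESTS_SAT_PER_STUDENT

# "Every three or four months" is read here as three or four terms a year.
yearly_low = tests_taken_per_term * 3
yearly_high = tests_taken_per_term * 4

print(f"exams prepared per term: {exams_prepared_per_term:,}")
print(f"tests taken per term:    {tests_taken_per_term:,}")
print(f"tests taken per year:    {yearly_low:,} to {yearly_high:,}")
```

Under these assumptions three terms a year give 90 million tests and
four terms give 120 million, a range that brackets the authors' round
figure of over 100 million.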

While there are no documented reports about the degree to which care
is exercised, a number of factors indicate that too much academic
measurement in the classroom is conducted in a cavalier fashion. On
the basis of an inspection of numerous tests over the years, loudly
voiced and sometimes embittered laments by many students, and
observation of too many untutored graduate teaching assistants
assigned the entire chore of testing and grading, we developed a
healthy skepticism about the practices in force. To check these
initial suspicions, two dozen college and university officials [...]

Complaints About Test Content

(1) In an introductory course covering approximately 2,500
    years of philosophy, five of the seven questions on the
    final exam were about Kant.






(2) Five textbooks were assigned in a course designed to
    promote understanding of the contributions of several
    individuals to the field being studied. Lectures and class
    discussions focused almost entirely on one of these people.
    All questions on the final concerned the one person.
Instructors must be much more careful than many seem to be.

(3) Students in a senior course were assigned 15 journal
    articles, the shortest of which was 14 pages long. The only
    examination question over this considerable volume of
    material asked students to match the articles' authors
    with the titles. Challenged by a colleague, the instructor
    argued he could assume that a student who could do this
    matching understood the material.
The assumption made by this professor requires a major leap in
logic. Moreover, future students might play the odds and limit
their learning to memorizing authors and titles.




Complaints About Grading

(4) Students in a technology course were assigned a project
    to be completed individually. They were informed that the
    only grading criterion would be the quality of the product.
    Several students devoted many hours to the assignment
    early in the term and finished several weeks before final
    exams. Other students maintained a leisurely pace and
    worked throughout the term. All the early finishers
    received F's, while the leisurely ones received A's and B's.
    The instructor maintained that many of the F's were
    awarded because of absences: he had not seen some of
    these students during the last month of the course. Even
    when reminded that his announced criterion was quality,
    not class attendance, he refused to alter the F's.

Many teachers are unduly sensitive about class attendance. To be
fair, a measurement should measure what has been asked for and
nothing else.

(5) A foreign language class contained students who had
    lived abroad, graduate students working toward the
    foreign language exams, and beginners. One error of any
    kind reduced a quiz or exam grade to a B, two errors to a
    C, and so on. Many studious and responsible students
    received D's and F's in this course.
While standards are important, there should be some demonstrated
relationship between reasonable expectations and grading criteria.

(6) After assigning quite high mid-term grades, an instructor
    declared that students were being coddled. Without any
    warning, term papers were graded very harshly (after
    the withdrawal date had passed). The final was graded
    equally severely. Course grades for the class of over 40
    included one A, one B, and a few C's; the rest were F's.
    Colleagues of the instructor arranged for the students to
    [...]
[...] have no place in evaluation.

(7) A freshman was told by an instructor she would receive a
    B in his course; her final grade was C. The student was
    applying for admission to a competitive program for
    which a few hundredths of a point in her GPA might
    determine her acceptance. Investigation revealed that the
    instructor was a teaching assistant who had left school.
    The department chairman believed this instructor's
    teaching and testing methods had been questionable and
    changed the grade. The student was admitted into the
    program.
Most teaching assistants do not receive formal instruction or
guidance about testing and grading. Incidents such as this one may be
prevalent.

(8) In a course which had a fairly rigid attendance
    requirement, a student requested an excused absence for a
    mandatory appearance in court and believed the instructor
    allowed the absence. The student received a final
    grade of C instead of the B he had expected. The instructor
    explained the C in different ways: to the student, too
    many absences (he disallowed the court appearance); to
    the department chairman, inadequate class participation
    (records indicated otherwise); to an administrator,
    poor written work (papers averaged B). The instructor
    refused to change the grade, but an academic grievance
    committee directed a change.
If an instructor explains a grade in so many ways, how can
evaluators of the student's transcript interpret the grade?

(9) A freshman who had maintained a C average on all his
    tests received a final grade of F. The instructor explained
    that the student had exhibited an improper spirit toward
    the subject matter and refused to alter the grade.

We have serious doubts about the propriety of grading a student's
"spirit" or attitude.






(10) A student with the highest overall point total in her class
     (90 percent) received a B rather than an A as her final
     grade. The instructor explained that his point system was
     absolute and that while she had a "moral A," he could
     not give her an A for the course. During the conversation
     he told her he had given another student a B when that
     student had only enough points for a C. He rationalized
     that there was a difference between giving someone a B
     and giving someone an A but did not explain what the
     difference was. The student appealed to the grievance
     committee.
Undergraduates say that they frequently encounter this professorial
attitude, although seldom in such a blatant incident. Assigning a
grade in such a manner is not responsible evaluation.

(11) A female student was informed by a professor: "Women
     do not belong in my field." Her grade for the course was
     significantly lower than the average she had maintained.
     The injustice was rectified with the assistance of the
     department chairman.
[...] must be eliminated in evaluating student achievement, and
every effort should be made to minimize personal prejudice.


Complaint About Test Conditions

(12) A final was administered to 130 students in a crowded
     classroom where it was easy for students to copy from
     each others' papers. Although many students thought the
     situation was unjust, the instructor refused to change the
     test location, blaming the institution for assigning too
     many students to the class and for providing the small
     classroom. An administrator, the department chairman,
     and the college dean intervened, and the test was given
     again under satisfactory conditions.
Inadequate and improper testing conditions are inexcusable.






Most of the students' complaints seem to be about grades or the
symbols, not about testing or measurement where the basic problems
are. Apparently most students are unaware of the fundamental issues in
measurement and evaluation and do not know the questions they should
be asking. They are not alone in seeing just the tip of the evaluation
iceberg. Thousands of studies have been conducted about grades and
grade point averages (GPAs), but the measuring devices from which
those symbols are derived are rarely questioned.



Study Influences

The effects of testing upon learning have been almost totally
ignored, yet experimental scientists have been concerned for many
years with the effects the act of measurement has upon the object or
phenomenon being measured. A good example is a blood-pressure reading.
At least two features of the act of measuring blood pressure distort
the true reading: the pressure of the inflated cuff and, for some
people, the emotional reaction to the procedure. The reading is false
to some degree because of either or both of these.

Generations of students have told their faculty that testing
influences them. They study according to the type of test they are
going to take and in so doing learn different features of the
material. A few studies support their assertions. Meyer found, by
analyzing notes made and the booklets which contained new material to
be learned, that a smaller percentage of students who were to receive
essay tests used underlining and a greater percentage of them made
summaries than students who were to take objective tests. Thomas and
Augstein found that students who were informed that their test on a
paper on genetics would be in essay form, but who in fact took
objective and essay tests, performed better on both types than did
students who studied the same material under the impression that their
test would be objective (but received the two types). Felker and Dapra
demonstrated that comprehension-type questions were more effective for
enhancing problem solving than verbatim-type questions.



Directions 

It seems likely that traditional testing and evaluation practices
(written tests covering subject matter and grading on curves) will
continue on a grand scale, especially in lower-division courses. The
remainder of this policy paper is devoted primarily to introducing
faculty members to basic principles of measurement and some of the
prominent unresolved issues of grading. Improved testing devices and
practices will help learning and grading. Numerous volumes have been
written about most of the topics that we only mention; carefully
selected references are given in Chapter 6. The purpose throughout
this volume is to alert faculties to some exceedingly complex
problems. The measurement of learning, the assigning of grades, and
determining the significance of the process are inordinately
complicated procedures.







Setting Learning Goals 



Teachers must understand what factors play a part in the
measurement of learning. Faculty who pay ample attention
to course content are often vague about the process
of evaluation. If learning goals and course objectives are
properly defined, they will be essential ingredients of
success for student and teacher alike.






One of the student-reaction-to-instruction forms used at the
University of Tennessee in Knoxville allows instructors to write in
extra items. Just before a term ended, one instructor wrote in, "Rate
your progress on the course objectives." We suggested that he might
list the objectives himself and ask the students to rate their
progress on each one. He replied he wasn't certain what the objectives
were, but he would try to determine them after the course was over.

How does one measure at all if one does not know what one wishes to
measure? Instructors should not be like St. Augustine when he
declared, "For so it is, O Lord my God, I measure it; but what it is I
measure, I do not know."



Goals and Objectives 

It should go without saying that effective evaluation (testing and
grading) is based on well-established goals and objectives, yet
frequently it is not. Faculty devote great amounts of attention to the
content of their courses (what to include, how to include it, and
what to exclude), but too few give as much time or energy to the
process of evaluation, even though the goals and objectives of a
course and of evaluation are the same.

Goals and objectives are often thought of separately so that their
roles in evaluation may be delineated. Goals may be defined as the
hoped-for end results or products of a sequence of educational
events. Goals may apply to a single course or to a sequential program
(e.g., a major). Objectives are the short-range events in a sequence
leading to a goal.

Goals can best be measured through the assessment of well-defined
objectives. This principle is as true for a professor's course as it
is for a college's curriculum. The goal for this volume is improved
testing and grading. One measure might be the number of faculty who
seek to apply the principles expounded. The objectives are for faculty
to understand the detailed ways of attaining the goal. One measure
might be their performances on a carefully constructed written test
over the contents.

Goals and objectives should be stated in as empirical a fashion as
possible so that they will be susceptible to evaluation. It is true
that some educational goals are difficult to state in definitive
terms, but difficulty is no excuse for not trying to come to grips
with the clarity of goal statements.

True, some curriculum and course goals do seem to defy evaluation.
Such goals, often found in college catalogs and course syllabi,
usually run as follows: The liberal arts education provides the
individual with the ability to comprehend the great outlines of
knowledge, the principles upon which it rests, the scale of its parts,
its





lights and shadows. A liberally educated person is identified by a
quality of mind. Educators insist these respectable and cherished
goals should not be compromised. As stated, they correspond to the
accepted definition of a goal as an abstract statement of a hoped-for
result (Mager, 1972). They do not, however, tell how to achieve
results. This is where objectives play a crucial role in describing
what knowledge, skills, understanding, and behaviors (such as
laboratory abilities) the students should possess after completing
their experience of the curriculum.

It is in defining objectives that many courses and curricula fall
short, which complicates evaluation. It is generally assumed that the
lauded goals are accomplished through various curricula [...]

Although this presentation of the basic principles of setting goals
and objectives is concerned with the level of the individual course
and the individual test, what pertains at this level is applicable to
an entire curriculum. Courses within a curriculum are assumed to be
cumulative. The vast majority of courses have prerequisites that
assume that [...]



Matching Test Items and Objectives

One of the most frequent violations of good procedure in setting
objectives for achievement assessment is a mismatch between the
objective and the unit of measurement chosen to assess it. The basic
unit of measurement of an objective is the individual test item, and
it is imperative that the two be well matched. In courses in
measurement, though, even some bright and well-informed graduate
students have great difficulty preparing test items that adequately
match the stated objectives. Matching is difficult, but not
impossible, particularly if objectives have been carefully stated.
Mismatching objectives and items is a widespread practice, and a
practice (malpractice) most urgently in need of improvement. When we
deceive the student by discrepancies between our words and our deeds,
both he [...]
The first task, therefore, is to define objectives. These should be
made as concrete as possible. Then matching the [...]

Students complain that an exam did not cover the con[...] means of
identifying a student's status with respect to an established [...]
accessible to measurement [...] (all manifested in writing) [...]
(exhibited by manipulating laboratory equipment).

There is probably no better way of stating an objective (or
initiat[...]) [...] at the end of this course?

This usually generates a long list of rather lofty goals that must be
translated into objectives. The key words are be able to do: read a
map, prepare a brief, test for diabetes, explain how a bill becomes a
law, describe the human eye, solve an equation, write an essay.



Domain Dictates Objectives

There is perhaps nothing more frustrating to students than to be
told that a course objective is for them to be able to write a
grammatically correct theme of 100 words and then have a well-meaning
professor discount points for lack of imagination and/or creativity.
Errors of this sort are commonplace. Not only is this an error in
stating objectives to students, it is also an error in the choice of
the appropriate item format or type. Creativity is an extremely
difficult






area to assess, but not impossible. One first defines it and then
chooses or constructs the appropriate items to assess its presence
or absence.

The problem of choosing appropriate objectives and subsequent test
items for assessing achievement within a given domain or area does
point up that the domain determines objectives, to a degree. We are
not suggesting that faculty back away from trying to assess those
goals that they regard as important just because a goal might seem
fuzzy. We are not sympathetic with those who contend that the crucial
things within their domain are inaccessible to objective assessment
and who often claim that only experience and subjective judgment can
serve as bona fide assessment. We repeat: If something is worth being
made a goal, it is worth being objectified! This position in no way
lessens the admirable qualities of a goal.
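
One way to make the matching of test items to stated objectives
concrete is a simple coverage check. The Python sketch below is only
an illustration under stated assumptions: the objectives, the item
labels, and the function name `audit` are all hypothetical and do not
come from the authors. It lists objectives that no item measures and
items that measure no stated objective.

```python
from collections import Counter

# Hypothetical course objectives, each phrased as something the
# student should be able to do (illustrative only).
objectives = {
    "O1": "identify good composition in a painting",
    "O2": "identify correct perspective in a painting",
    "O3": "choose representative works of fine art from a list",
}

# Hypothetical test items, each tagged with the objective it is meant
# to measure (None marks an item written with no objective in mind).
items = {
    "Q1": "O1",
    "Q2": "O1",
    "Q3": "O3",
    "Q4": None,
    "Q5": "O9",  # refers to an objective that was never stated
}


def audit(objectives, items):
    """Return (unmeasured objectives, unmatched items, items per objective)."""
    counts = Counter(obj for obj in items.values() if obj in objectives)
    unmeasured = [o for o in objectives if counts[o] == 0]
    unmatched = [q for q, obj in items.items() if obj not in objectives]
    return unmeasured, unmatched, counts


unmeasured, unmatched, counts = audit(objectives, items)
print("objectives with no test item:", unmeasured)
print("items matching no stated objective:", unmatched)
print("items per objective:", dict(counts))
```

A lopsided count for one objective is the pattern behind the complaint
about the philosophy final on which five of seven questions concerned
Kant.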

A good example is a course in art appreciation. If a goal of the
course is to appreciate fine art, one simply has to state what the
student should be able to do at the course's conclusion. For example,
the student might be expected to be able to choose from a list of
paintings five that would be considered as representative of fine art
by a panel of experts, fine art, in turn, having been made definable
by excluding from it violations of characteristics common to fine
art, i.e., good composition, perspective, and so on.

The claim that these types of assessment issues are not accessible
or are too open to subjective interpretation is inaccurate. There is
little question that today's dime-a-dozen novels will not be
tomorrow's literary masterpieces or Pulitzer Prize winners. There is
little room for doubt that Dante's Inferno is superior. Subjectivity
enters when one is asked to indicate whether one likes or dislikes a
book. This is a personal rendition of one's own experience, but to be
able to discern the characteristics of great literature from a random
selection of books is something someone can learn to do and
subsequently demonstrate.
To emphasize the importance of defining and specifying performance
objectives, Mager (1973) suggests the rather humorous "Hey Dad"
technique. Here, one places a course objective within the following
context: "Hey Dad, let me show you how I can ____!" If the result of
filling in the blank is a seemingly absurd statement, the objective
is too broad and needs clarification and simplification. In our
example of art appreciation as a course objective, the following
absurdity would be the result: "Hey Dad, let me show you how I can
appreciate fine art!" This absurdity can be obviated by specifying
the generally agreed-upon component behaviors or performances of
art appreciation. The following examples make the initial objective
more tolerable:

Hey Dad, let me show you how I can, when presented with them,
accurately identify 10 out of 10 Renaissance paintings, supply their
titles and the artists' names, name two additional paintings each has
done, when and where each lived, three contributions each has made to
the history of art, and two elements of their work that have led them
to be judged as outstanding in the history of art.

In this fashion, art appreciation becomes less fuzzy and is more
easily assessed.



An Illustration 

There are several ways of measuring the extent to which course
objectives have been met. As previously mentioned, the domain or
area does exercise some influence over the type of test or
measurement one uses to assess course objectives. There is, however,
a basic reciprocity between the types of test employed and the
objectives of a course. For example, it just makes good sense to use
performance (i.e., observable behavior) to assess the objectives of
performance courses. Most of the physical sciences require laboratory
skills the attainment of which requires the instructor to observe
whether the student can do the task in question. Most academic
courses are assessed by asking students to perform on a written
exam. In other words, instructors are assessing students' ability to
do something vis-a-vis their response to a written question. Within
this form of testing, we ask them to demonstrate knowledge of or
about in a variety of ways: multiple-choice testing, matching,
true-false, and essay, to mention just a few of the varieties.

An example demonstrates how a simple objective is amenable to
different testing forms. The basic case of "Making a Pot of Coffee" is
drawn from Mager (1973). Making a pot of coffee with an electric
coffee pot calls for knowing how to do 10 definite things:

(1) disconnect coffee pot, (2) disassemble coffee pot, (3) clean
components and pot, (4) inspect components of pot, (5) fill pot with
water, (6) reassemble components of pot, (7) fill basket with coffee,
(8) reconnect coffee pot, (9) set dial on coffee pot, (10) note if pot
is perking properly.

A student's knowledge can be assessed in a variety of test types.
One of the objectives in teaching coffee making might call for a
knowledge of (or ability to recognize or state) the correct sequence
of action in making a pot of coffee. One multiple-choice question
could take the following form:

1. Of the items below, which is the first step in making a pot of coffee?
   (a) fill the basket with coffee
   (b) note if pot is perking properly
   (c) disassemble coffee pot
   (d) disconnect coffee pot

An essay question requiring this same knowledge might take the
following form: Please describe in no more than 100 words the 10
important steps in making a pot of coffee.

A matching test on coffee making might be prepared as follows:



Examine each step and make a comparison right and left list:

Left                                Right

Step 1 disconnect coffee pot        Step 2 disassemble coffee pot
     3 clean components and pot          4 inspect components
     5 fill pot with water               6 reassemble components
     7 fill basket with coffee           8 reconnect coffee pot
     9 set dial on coffee pot           10 watch to see if pot is
                                           perking properly

Then shuffle the right list to derive the following:

Step 1 disconnect coffee pot        reassemble components
     3 clean components and pot     disassemble coffee pot
     5 fill pot with water          watch to see if pot is perking properly
     7 fill basket with coffee      inspect components
     9 set dial on coffee pot       reconnect coffee pot

The matching test for the students would then be:

The list on the left contains the correct ordering of steps 1, 3, 5, 7,
and 9 of the 10 appropriate steps in making a cup of coffee. The list
on the right contains steps 2, 4, 6, 8, and 10. However, the steps on
the right have been shuffled. Your task is to draw a line from Step 1
on the left to the appropriate Step 2 on the right, a line from Step 3
on the left to the correct Step 4 on the right, and so on until you
have correctly matched all 10 steps in their correct sequence.
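
The preparation of such a matching test can also be sketched in code. The Python below is only an illustration under stated assumptions: the function name `build_matching_test` and the optional `seed` argument are introduced here and are not part of the procedure described in the text. It keeps the odd-numbered steps in order as the left list, shuffles the even-numbered steps to form the right list, and records an answer key.

```python
import random

# The 10 steps in their correct order, as listed in the text.
STEPS = [
    "disconnect coffee pot",
    "disassemble coffee pot",
    "clean components and pot",
    "inspect components",
    "fill pot with water",
    "reassemble components",
    "fill basket with coffee",
    "reconnect coffee pot",
    "set dial on coffee pot",
    "watch to see if pot is perking properly",
]


def build_matching_test(steps, seed=None):
    """Return (left, right, key) for a matching test.

    left  -- odd-numbered steps (1, 3, 5, ...) in their correct order
    right -- even-numbered steps (2, 4, 6, ...) in shuffled order
    key   -- for each left step, the index in `right` of the step that follows it
    """
    left = steps[0::2]
    answers = steps[1::2]
    right = answers[:]
    # A seed (hypothetical convenience) makes the shuffle repeatable.
    random.Random(seed).shuffle(right)
    key = [right.index(answer) for answer in answers]
    return left, right, key


if __name__ == "__main__":
    left, right, key = build_matching_test(STEPS, seed=1)
    for row, (left_step, right_step) in enumerate(zip(left, right)):
        print(f"Step {2 * row + 1}: {left_step:<28} {right_step}")
```

The key is kept apart from the two printed lists so that the instructor, or a colleague checking the test, can verify each line a student draws.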

As we shall presently see, test construction is a time-consuming
task, principally because the preparation of learning objectives
must be done with great care. This is the key to successful testing.

Constructing Tests



Faculty widely confuse concepts of measurement and
student evaluation. Regardless of the type of test chosen,
both concepts must be carefully kept in mind. Multiple-choice
and essay tests are most commonly used in college today.
A thorough analysis of their structure and purpose
clarifies underlying principles of evaluation as a
learning tool.

Our investigations into student assessment have led to several
conclusions: (1) There is real confusion about the concepts of
measurement and evaluation. (2) Many faculty members believe their
discipline is so unique that little is to be learned about academic
measurement from faculty of other disciplines. (3) Instructors feel
there must be no interference in their testing and grading of
students, not even by their own disciplinary colleagues.

Tests should promote learning. They should assist the student and
the instructor in determining whether learning goals are being
achieved. If they are not, then both participants may alter
strategies. In this private context, formal measurement is of little
importance, because errors in judgment by the instructor can be
corrected and honest differences of opinion can be resolved. Central
to exchanges between the two is the student receiving detailed
criticism of his or her work and constructive suggestions for
improving it.

What has happened, however, is that the letter symbols resulting
from tests are used almost solely for official record keeping. Many
instructors do not view testing as part of the learning process and
as a result resent spending class time on it, return exams to students
with no correction marks or comments upon them, and never show
final exam results to students. Students, in accepting this limited
use of tests, strive to gain points rather than to learn.

In this context, it is difficult to understand how the defensive cry
of "academic freedom" (meaning "Stay away: I'll test and grade as
I please") can be justified. Faculty members are fallible. They can
be capricious (Case 6, page 14) in their judgments of student
achievement, and poorly constructed tests can support those
judgments. In the final analysis, it is the student who pays the
price; and the best students are harmed the most. They are the ones
who engage in "point grubbing" because they hope to enter graduate
and professional schools, and very tiny fractions of GPA points may
decide their fates.

The thesis here is simple. Since the results of measurement of
student achievement are currently used more to serve the public
than to promote learning (that is, the results are made available to
employers and others to be used in the selection process), individual
faculty members can no longer pretend infallible judgment about
student assessment. While we disagree with this public function,
since it will continue it must be improved. This chapter will explain
and clarify the concepts of both testing and grading, and introduce
some of the necessary principles and techniques of measurement.



Concepts: Measure, Evaluate

The word "measure" has at least 40 different meanings (Lorge). In
the present context measure is intended to mean all those activities
which are necessary to quantify learning or achievement: the
preparation of single questions or items, the selection of items or
questions to make up a test or examination, the conditions under which
the test is administered, scoring each individual item, and assigning
a score, number, or quantity to the whole. In everyday parlance, all
of these activities are referred to as testing.

The goal of objectivity is sought in all measurement. In the hands
of several trained people, the same instrument, whether a ruler, a
watch, a sextant, a sphygmomanometer, or an English test, should
yield the same reading. Ebel's (1972) definition applies with equal
force to all educational tests: "A measurement is objective if it can
be verified by another independent measurement. If it cannot be,
that is, if the measurement reported depends more on the person
making the measurement than on the person being measured, it is
unlikely to be very dependable or very useful...."

The greater the care with which an instrument is constructed, the
greater the likelihood that two or more trained people will obtain
the same reading (or quantity or score) for the same value or
operation. Most people seem to be alert to this principle for physical
measurements, but much less attuned to it for educational ones. This
general lack of sophistication is illustrated by the prevalence of
superficial thinking about so-called objective tests. Multiple-choice
[illegible] scorers will arrive at the same score for an examinee
after keys are [illegible]. But the score or quantity assigned is only
one aspect [illegible]. If other principles of measurement have been
applied carelessly, the test is not objective.

It is a common error to equate quantification [illegible] with
objectivity. As Stalnaker has expressed it, [illegible] American mind
seems extremely vulnerable to the belief that any [illegible] which
can be expressed in figures is [illegible] as the figures in which it
is expressed. Upon reflection, [illegible]. For example, an 85 on a
test paper could have been derived arbitrarily, and the instrument on
the basis of which it was calculated could have been constructed
poorly in the first place.

As we use the term, "evaluation" means arriving at a judgment or
[illegible] that the blood pressure is normal or abnormal.



The greater the care with which an instrument is
constructed, the greater the likelihood that two or more
trained people will obtain the same reading (or
quantity or score) for the same value or operation.
Most people seem to be alert to this principle for physical
measurements, but much less attuned to it for educational
ones. This general lack of sophistication is illustrated
by the prevalence of superficial thinking about
so-called objective tests.



The driver, after trying to collect sound information about
automobiles, weighs the evidence and buys car A rather than
[illegible]. The instructor examines a student's test performance,
[illegible] conclusion about the level of achievement, and expresses
it in a symbol. The [illegible] which judgments are based upon
carefully constructed and administered measuring devices, the
greater the likelihood they will be sound. Factors to be considered in
evaluating student achievement (assigning grades) are discussed
more fully in Chapter 4.



Test-Question Principles

There is at least one unalterable fact about testing: It is
time-consuming. There are no short cuts to constructing a good test.
Tests take many forms: multiple-choice, true-false, essay, matching,
completion, problems, interpretive, and combinations of these. We
have set forth certain principles and recommendations that are
applicable to written tests because without question such tests are
used almost exclusively in higher education. In this connection the
work of Ebel (1965, 1972) has been drawn on heavily, and the reader
might also see Adkins and Dressel.

The basic unit in a written test is the individual item or
question; improvement in measurement begins at this point.
Judging from the literature, less attention has been devoted to item
preparation than to any other feature of test construction. For this
reason, certain principles of item preparation are emphasized, with
many examples.

Instructors who prepare items or questions must possess several
abilities:

• Thorough mastery of the subject matter. Item writers must be
acquainted with facts and principles, attuned to their implications,
and aware of popular fallacies and misconceptions. Most graduate
teaching assistants do not have such mastery.

• A rational and well-developed set of aims or objectives for the
instruction. For most courses these will include helping students
learn facts and principles, make abstract generalizations, be
critical, and apply what has been learned in other settings. The
importance of aims and objectives cannot be overstressed.

• A mastery of written communication. Those who have written
for publication have learned how difficult it is to choose the right
words and to arrange them to convey the meaning intended. Students
probably give the words in test questions much more critical
attention than almost any other prose receives.

• A knowledge of the special techniques of item writing and how
to use them. Some of these will be discussed further on.

Since the two test forms used most commonly are multiple-choice
and essay and since our space is limited, we will discuss the
development of only those two in some detail.

Multiple-Choice Questions

Multiple-choice tests have been condemned roundly by many
instructors and students (the latter sometimes refer to them as
multiple-guess). Much of this criticism is well-founded because many
tests are constructed carelessly. Items tend to be ambiguous and to
emphasize the trivial. In one study (McGuire), three judges
classified test items that covered knowledge in medical subjects and
unanimously agreed that over half of the items measured
predominantly recall and recognition of isolated information. Fewer
than one fourth of the items were thought by any single judge to
require even simple elements of interpretation or problem solving.

Properly developed, however, multiple-choice tests can tap many
facets of learning. The principles here set forth are merely
introductory and may appear deceptively simple, but their application
is time-consuming and demanding. Illustrative questions or items* are
uncomplicated in the hope they will enable the disciplinary
specialist to focus upon the principle.

(1) Strive for item clarity. The English language is full of ambiguous
words. The printed page cannot convey such clues to meaning as
voice inflections and facial expressions. Test items should not be
verbal puzzles. A test's purpose is to test or measure knowledge
rather than verbal puzzle-solving ability. The major recommendation
for attaining clarity in items or questions is: Every item, before
it is used, should be responded to by a colleague and by an
advanced student (the latter will detect vagueness, ambiguities, and
errors the former might miss).

(2) Include in the stem or body all necessary qualifications that are
needed for answer selection. Consider the following multiple-choice
question:

If a ship is wrecked in very deep water, how far will it sink?
1. Just under the surface.
2. To the bottom.
3. Until the pressure is equal to its weight.
4. To a depth which depends in part upon the amount of air it
   contains.

The instructor intended 2 as the correct answer, but several capable
students chose 4 because they considered the possibility (which
the instructor failed to exclude) that a wrecked ship might not sink
completely.

(3) Generally omit nonfunctional words. They tend to interfere with
comprehension. Consider:

While many in the U.S. feared the inflationary effects of a general
tax reduction, there was widespread support for a federal
community-property tax law under which
1. husbands and wives could split their combined income and file
   separate returns.
2. homesteads would be exempt from local real estate taxes.
3. state income taxes might be deducted from federal returns.
4. farmland taxes would be lower.

Comprehension of this item may be facilitated by rewording it as
follows:

Community-property tax laws permit
1. husbands and wives to split their combined income and file
   separate returns.
2. homesteads to be exempt from local real estate taxes.
3. state income taxes to be deducted on federal returns.
4. farmland taxes to be lowered.

Sometimes, though, it is useful to include introductory statements
that help to emphasize importance:

The pollution of streams in the more populous regions of the United
States is causing considerable concern. What is the effect, if any, of
sewage on the fish life of a stream?
1. It destroys fish by robbing them of oxygen.
2. It poisons fish by the germs it carries.
3. It fosters development of nonedible game fish that destroy edible
   fish.
4. Sewage itself has no harmful effect on fish life.

*Items are from Ebel, Robert L., "Writing the Test Item," in
Educational Measurement, edited by E. F. Lindquist.

(4) Beware of unessential specificity and/or trivia. Consider:

What percent of the milk supply in municipalities of over 1,000 was
safeguarded by tuberculin testing, abortion testing, and
pasteurization?

1. 11.1 percent
2. 20.3 percent
3. [illegible] percent
4. 51.5 percent
5. 81.5 percent

This item, encouraging rote memorizing, is an illustration of the
trivia about which so many students complain. Furthermore, such
figures are seldom as precise as they appear.

(5) Be certain the stem is accurate. Consider:

Why did Germany want war in 1914?
1. She was following an imperialistic policy.
2. She had a long-standing grudge against Serbia.
3. She wanted to try out new weapons.
4. France and Russia hemmed her in.

Who is in any position to say that Germany wanted war? Such
inexactitudes may strengthen misinformation on the part of students.

(6) Adapt the level of difficulty of the item to the group and to the
purpose for which the item is intended. Consider:

If a tree is growing in a climate where rainfall is heavy, are large
leaves an advantage or a disadvantage?
1. An advantage, because the area for photosynthesis and
   transpiration is increased.
2. An advantage, because large leaves protect the tree during heavy
   rainfall.
3. A disadvantage, because large leaves give too much shade.
4. A disadvantage, because large leaves absorb too much moisture
   from the air.

The above item illustrates an increased level of difficulty because it
requires knowledge of both the answer and an explanation for it.

(7) Omit clues to the correct response. Items that contain clues or
cues are not measuring what the instructor intended. Including
clues is perhaps the most frequent error made in multiple-choice
tests. In the following item it is necessary only to know that "exert"
is commonly used with "pressure":

What does an enclosed fluid exert on the walls of its container?
1. Energy       3. Pressure
2. Friction     4. Work

In the next item the stem calls for a plural answer, which occurs
only in 4.

Among the causes of the Civil War were:
1. Southern jealousy of northern prosperity.
2. Southern anger at interference with the foreign slave trade.
3. [illegible]
4. Differing views on the tariff and Constitution.

[illegible] of greater length than the others. Students catch on
quickly to such a clue.

[illegible]
1. They wished to honor our alliance with France.
[illegible]

In the next item there are common elements in the stem and in the
[illegible].

What led to the formation of the States' Rights Party?
1. The [illegible] of federal taxation.
2. The demand of states for the right to have their own laws.
3. The industrialization of the South.
4. The corruption of many city governments.

Finally, such specific clues as "all," "always," "certainly," and
"never" are to be avoided; they are clues to incorrect answers.
Moreover, scholars are leery of absolutes and probably should
encourage students to be.

(8) Do not use a negatively stated item stem. Experience has shown
that these tend to confuse students, yet some items contain two and
three negatives and seem like intricate verbal puzzles.

Which of these is not one of the purposes of Russia in consolidating
the communist party organization throughout eastern Europe?
1. To balance the influence of the Western democracies.
2. To bolster her economic position.
3. To improve Russian-American relations.
4. To improve her political bargaining position.

Which of these is not true of a virus?
1. It is composed of very large living cells.
2. It can reproduce itself.
3. It can live only in plant and animal cells.
4. It can cause disease.

(9) Be certain that the correct answer is one on which competent
critics agree. Consider:

What is the chief difference in research work between colleges and
industrial firms?
1. Colleges do much research, industrial firms little.
2. Colleges are more concerned with basic research, industrial firms
   [illegible].
3. Colleges lack the well-equipped laboratories which industrial firms
   [illegible].
4. Colleges publish results, while industrial firms keep their findings
   secret.

Competent authorities could not agree upon the best response to the
above. If this type of item is to be used, a qualification should be
offered in the stem, such as, "According to ____ the chief
difference...."

(10) Avoid answer alternatives that overlap or include each other.

What percent of the total (property) loss due to hail is the loss of
growing crops?
1. Less than 20 percent
2. Less than [illegible] percent
3. More than 50 percent
4. More than [illegible] percent

If 1 is correct, then 2 is also correct; and if 4 is correct, then 3 is
correct.
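
An overlap of this kind can be checked mechanically when the alternatives are numeric ranges. The Python sketch below is an illustration only: the function name `overlapping_pairs` and all cutoff values in the two example items are hypothetical, chosen to mimic the structure of the item above rather than to reproduce it. Each alternative is treated as a range of percentages, and the function reports every alternative that lies wholly inside another.

```python
def overlapping_pairs(alternatives):
    """Return (inner, outer) label pairs where one alternative includes another.

    `alternatives` maps an option label to a (low, high) percentage range.
    """
    pairs = []
    for inner, (inner_low, inner_high) in alternatives.items():
        for outer, (outer_low, outer_high) in alternatives.items():
            if inner == outer:
                continue
            # The inner range is included when it fits entirely within the outer one.
            if outer_low <= inner_low and inner_high <= outer_high:
                pairs.append((inner, outer))
    return pairs


# Hypothetical cutoffs with the same flaw as the item in the text:
# "less than" and "more than" ranges that swallow one another.
flawed = {1: (0, 20), 2: (0, 30), 3: (50, 100), 4: (95, 100)}

# One possible repair (also hypothetical): ranges that do not include each other.
repaired = {1: (0, 20), 2: (20, 50), 3: (50, 80), 4: (80, 100)}

if __name__ == "__main__":
    print(overlapping_pairs(flawed))
    print(overlapping_pairs(repaired))
```

A check like this catches only numeric overlap; alternatives that overlap in meaning still need the judgment of a colleague.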

This discussion is not intended to suggest that test questions for
college students should be simple or tests easy. For the most part,
the examples emphasize item clarity; they do not deal with what
should be measured: factual information, concepts, appreciation,
and so on. Many authorities believe that multiple-choice items, if
constructed with great care, can measure conceptual knowledge,
ability to generalize, and so forth. The way to prepare such items is
to be clear about one's own objectives of instruction and to enlist the
assistance of one's colleagues in judging whether a particular item
measures what is intended.



Essay Questions 

For a variety of reasons, essay questions or items require less
preparation time than multiple-choice ones; on the other hand, the
essay type requires much more time to score. We estimate, however,
that for classes numbering around 35 students the instructor would
invest about equal time for properly prepared multiple-choice tests
and for properly scored essay ones. Faculty time, however, is not
the sole criterion for deciding between the two types of test. The
essay question, permitting freedom of response, can test how students
approach a problem, what information they think is important, and
what conclusions they reach. Debates continue over other qualities
or abilities that essay questions are purported to measure (for a
research review, see Yeasmeen and Barker).

Whatever the merits and faults of essay questions, they afford
students an opportunity to express themselves in their own words,
as Stalnaker, among many others, has emphasized. Essay questions
compel students to think about a topic, decide what to say about it
and how to say it, and do the writing. These are important abilities
in an educated person, and many faculty members are convinced
that the development of these abilities has been deterred by the
excessive use of objective tests. At the very least, essay questions
give students an incentive to write.

Most of the principles for promoting multiple-choice item clarity
apply equally to essay questions. The application of several
additional principles will increase the chances of attaining scoring
consistency (or reliability).

(1) Limit the scope of the question. There is simply no way of scoring
fairly such broad questions as "Discuss Shakespeare's tragedies"
or "Analyze the energy crisis." Moreover, students must guess
which replies will please the instructor; they must "psych out the
prof."

Restrictions of the scope may vary. Of course, limits may be
imposed by calling for brevity and conciseness, insisting upon only a
few sentences, or specifying the space to be used. Questions
may be structured in other ways: by asking students to compare,
contrast, discriminate, note limitations, draw inferences, state
conclusions tersely, and so on.

(2) Avoid essay questions that are based on personal feelings.
Educators are in no position to measure or quantify students'
feelings about any issue. Such questions as, What does modern art
mean to you?, Do you relate to the writings of e.e. cummings?, and
How do you feel about Truman as a President? promote "psyching out
the prof." If the answers are honest, there are no standards by which
they can be quantified. To many people modern art means nothing;
others cannot abide e.e. cummings, and well-informed persons differ
about Truman. Where the affective domain is concerned, unfair
and improper judgments are more likely to be rendered in official
records when students' feelings and opinions do not agree with
those of their instructors.

(3) Be certain that an adequate answer can be given in the time
allowed. It is amazing how often this simple rule is violated, even by
those who know from personal experience how difficult it is to
organize thoughts and present them coherently. Again, the issue is
what is being measured: the quickest student is not necessarily the
best one in all respects.

(4) Use the following procedures for scoring essay items, bearing in
mind that the subject is measurement and the goal of measurement
is objectivity:

• Minimize, as far as possible, clues that will identify the owners
of the papers; at the very least, remove the names. It is all too easy
to allow extraneous knowledge about a student to influence the
marking of his or her paper. Such precautions should help to assure
minority students that the marking process is free of discrimination.

• Write out an ideal answer ahead of time and ask a colleague to
do likewise; combine the two into a standard with which students'
replies can be compared.

• Score each item on a point scale without reference to a passing
grade (assigning a grade is evaluation, not measurement); that is,
determine prior to scoring that item 1 can earn 10 points, for
example, item 2 is worth 20 points, and so on. Total all the points and
then assign a letter grade.
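
The separation of scoring from grading in the last procedure can be made concrete with a short sketch. The Python below is an illustration under stated assumptions: the 10 and 20 point ceilings come from the example in the text, but the function names, the paper code numbers, and the letter-grade cutoffs are hypothetical and introduced only for this sketch.

```python
# Maximum points per item, fixed before any paper is scored.
# The values 10 and 20 follow the example in the text.
MAX_POINTS = {1: 10, 2: 20}

# Hypothetical cutoffs (minimum total, letter); an assumption for illustration only.
CUTOFFS = [(27, "A"), (24, "B"), (21, "C"), (18, "D")]


def total_score(awarded, max_points=MAX_POINTS):
    """Measurement: check each item's points against its ceiling and total them."""
    if set(awarded) != set(max_points):
        raise ValueError("every item must be scored exactly once")
    for item, points in awarded.items():
        if not 0 <= points <= max_points[item]:
            raise ValueError(f"item {item}: {points} is outside 0..{max_points[item]}")
    return sum(awarded.values())


def letter_grade(total, cutoffs=CUTOFFS):
    """Evaluation: convert a total into a letter only after scoring is finished."""
    for minimum, letter in cutoffs:
        if total >= minimum:
            return letter
    return "F"


# Papers are keyed by code number rather than by student name (hypothetical codes).
papers = {"paper-017": {1: 8, 2: 17}, "paper-042": {1: 10, 2: 19}}
totals = {code: total_score(points) for code, points in papers.items()}
grades = {code: letter_grade(total) for code, total in totals.items()}
```

Keeping the two functions apart mirrors the distinction drawn above: the totals can be checked or rescored by a colleague without any reference to where the grade boundaries fall.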



Test Construction 

Most tests are composed of several items or questions that are put
together for some specific purpose, measuring students' ability to
translate a foreign language, for example. Questions on a given test
constitute a sample of all the questions that could be asked. There
are no hard and fast rules that will produce a representative
sample of questions, but there are guidelines that will increase the
chances of a fair distribution:



• Have the items reflect instructional objectives you have attempted
to promote. This guideline is difficult to elaborate because goals or
objectives will vary from course to course. The trick is to aim for an
unbiased sample of questions. In Case 1 (page 13), the test was
biased in that the majority of questions were on Kant although the
goals of the course were not limited to understanding that
gentleman. Perhaps the simplest way to avoid an unduly biased sample
of questions on a particular test is to have a colleague criticize it
before the test is given.

• Generally speaking, the greater the number of items in a test,
the more representative the sample. This is one of the arguments in
favor of multiple-choice items. In a given period of time, more
multiple-choice than essay questions can be answered.

• Allow ample time for all students to respond to all the
questions. The experience of colleagues about the optimum length of a
test for a given time period will be helpful.





Errors in Test Construction

Regardless of the care with which tests are constructed, there will
be errors just as there are errors in all measurement. In physical
measurement, the errors stem from at least two sources, defects in
the measuring instrument and perceptual distortions associated
with the person taking the reading. For educational measurement
there is an additional source: the person being measured. The
performance of anyone tends to fluctuate from day to day for a variety
of reasons. The goal, then, is to minimize errors in measurement.

Regarding the instrument or test, clearly written items and a
representative sampling of material will decrease errors and increase
reliability (i.e., consistency or stability). Also, generally speaking,
the longer a test, the greater its reliability. Multiple-choice tests
tend to be more reliable than essay ones because more questions
can be answered in a specified period of time.
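The relation between length and reliability can be made concrete. The text names no formula; the sketch below assumes the standard Spearman-Brown formula, which predicts the reliability of a test lengthened with comparable items. The starting reliability of .50 is a hypothetical figure chosen for illustration.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is made `length_factor` times as
    long using items comparable to the existing ones."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Hypothetical test with reliability .50, then doubled and quadrupled.
for factor in (1, 2, 4):
    print(factor, round(spearman_brown(0.50, factor), 2))
```

The gain depends on the added items being as well written as the originals; padding a test with poor items will not produce it.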

As for reliability of marking, properly prepared multiple-choice
tests are the least subject to error. Scoring of essay tests tends
toward unreliability or inconsistency. Two or more instructors are
likely to arrive at different scores, and the same instructor may
arrive at different scores at different times. The reliability of a given
test can be determined statistically, and for large introductory
courses such determinations are very much in order. Ebel (1972)
has estimated the average reliability of college tests to be .45, a
coefficient that reflects unreliability, inconsistency, and imprecision.
(A perfect coefficient of reliability is 1.00.)
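One common way to determine reliability statistically for a test scored right or wrong is the Kuder-Richardson formula 20. The text does not say which statistic is intended, so the choice of this formula is an assumption, and the five-student data set below is invented for illustration.

```python
from statistics import pvariance

def kr20(responses):
    """Kuder-Richardson 20 for right/wrong items.

    `responses` holds one list per student, each containing a 0 or 1 for
    every item. Population variance is used for the total scores so that
    it matches the p * (1 - p) item variances.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    total_variance = pvariance([sum(student) for student in responses])
    item_variance_sum = 0.0
    for i in range(n_items):
        p = sum(student[i] for student in responses) / n_students
        item_variance_sum += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - item_variance_sum / total_variance)

# Invented responses: five students, four items.
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(data), 2))
```

With a class this small the coefficient means little; the calculation becomes worthwhile for the large introductory courses mentioned above.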

The third source of error, the person being measured, encompasses
the day-to-day personal variations we all experience and the
conditions under which a test is administered. These can help or
hinder performance. For the purpose of averaging out day-to-day
variations, conducting several tests during a course will tend to
yield more reliable or consistent measures than giving a single one.
Although we do not recommend a specific number of tests, it is clear
that giving only one test for an entire course is likely to be
unreliable. As for physical conditions in a room used for a test,
inadequate ventilation, uncomfortable temperatures, poor lighting or
excessive crowding will tend to cause inaccurate measures; in Case 12,
cheating resulted in inaccurate scores. Poor testing conditions
are inexcusable.

Rhodes has summarized the meaning of errors of measurement:

It is assumed that for each test a student takes, there is a true score he
should make that may differ from the score he actually achieves. The
true score would be free of the accidental error caused by factors such
as the questions selected for the test, how the student feels on the day
of the test, the temperature of the testing room, and so on.
Theoretically, if a student took an infinite number of equivalent editions
of a test, the scores he obtained would vary somewhat but would cluster
around an average, or true score. The score a student actually obtains
on any one occasion is, then, an approximation of this true score and
should be thought of as representing an interval, or obtained-score
range, the limits of which are determined by use of the standard error
of measurement.

Finally, in this respect, there is a statistical formula for calculating
the standard error of measurement that is useful for large classes.
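The formula is not given in the text. The sketch below assumes the usual textbook form, in which the standard error of measurement equals the standard deviation of the scores multiplied by the square root of one minus the reliability. The standard deviation of 10 points is hypothetical; the reliability of .45 is the average Ebel reports.

```python
import math

def standard_error_of_measurement(score_sd, reliability):
    """Usual textbook form: SD * sqrt(1 - reliability)."""
    return score_sd * math.sqrt(1.0 - reliability)

def obtained_score_range(observed, score_sd, reliability):
    """Interval of one standard error on either side of the observed score."""
    sem = standard_error_of_measurement(score_sd, reliability)
    return observed - sem, observed + sem

# Hypothetical class: scores have a standard deviation of 10 points.
sem = standard_error_of_measurement(score_sd=10, reliability=0.45)
low, high = obtained_score_range(89, score_sd=10, reliability=0.45)
print(f"SEM = {sem:.1f} points; a score of 89 represents roughly {low:.0f} to {high:.0f}")
```

Under these assumptions a single score stands for a band some fifteen points wide, which is the interval Rhodes describes.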

A test can be very reliable, yielding precise and accurate scores,
but really not measure anything of importance. Such a test is of
course invalid. While there are several concepts of validity (for
detailed discussions, see Ebel, 1972), only two need be of direct
concern: content validity and predictive validity.

Content validity means that a test measures what it is supposed to
measure, for example, critical thinking about economics or problem
solving in calculus. Well-formulated objectives for a course are the
first prerequisite for attaining content validity of test items. The
second requirement is the advice of one's colleagues.

The multiple-choice items in the following table presumably have 
content validity. 



Multiple-Choice Items Intended to Test
Various Aspects of Achievement*

Understanding of Terminology or Vocabulary

The term "fringe benefits" has been used frequently in recent years in
connection with labor contracts. What does the term mean?
1. Incentive payments for above-average output
2. Rights of employees to draw overtime pay at higher rates
3. Rights of employers to share in the profits from inventions of their
   employees
4. Such considerations as paid vacations, retirement plans, and ...

What is the technical definition of the term "production"?
1. Any natural process producing food or other raw materials
2. The creation of economic values
3. The manufacture of finished products
4. The operation of a profit-making enterprise

Knowledge of Fact and Principle or Generalizations

What principle is utilized in radar?
1. Faint electronic radiations of far-off objects can be detected by
   supersensitive receivers.
2. High-frequency radio waves are reflected by distant objects.
3. All objects emit infrared rays, even in darkness.
4. High-frequency radio waves are not transmitted alike by all
   substances.

The most frequent source of conflict between the western and eastern
parts of the United States during the course of the nineteenth century
was
1. The issue of currency inflation
2. The regulation of monopolies
3. Internal improvements
4. Isolationism vs. internationalism
5. Immigration

Ability to Explain or Understanding of Relationships

If a piece of lead suspended from one arm of a beam balance is
balanced with a piece of wood suspended from the other arm, why is the
balance lost if the system is placed in a vacuum?
1. The mass of the wood exceeds the mass of the lead.
2. The air exerts a greater buoyant force on the lead than on the
   wood.
3. The attraction of gravity is greater for the lead than for the wood
   when both are in a vacuum.
4. The wood displaces more air than the lead.

Should merchants and middlemen be classified as producers or
nonproducers? Why?
1. As nonproducers, because they make their living off producers and
   consumers
2. As producers, because they are regulators and determiners of
   price
3. As producers, because they aid in the distribution of goods and
   bring producer and consumer together
4. As producers, because they assist in the circulation of money

Ability to Calculate or Numerical Problems

If the radius of the earth were increased by three feet, its
circumference at the equator would be increased by about how much?
1. ... feet
2. 12 feet
3. 19 feet
4. 28 feet

What is the standard deviation of this set of five measures: 1, 2, 3, 4, 5?
1. 1
2. √2
3. 10
4. √10
5. None of these



*Robert L. Ebel, Essentials of Educational Measurement, © 1972.
Reprinted with permission of Prentice-Hall, Inc.






Ability to Predict or What Is Likely to Happen
Under Specified Conditions

If an electric refrigerator is operated with the door open in a perfectly
insulated sealed room, what will happen to the temperature of the
room?
1. It will rise slowly.
2. It will remain constant.
3. It will drop slowly.
4. It will drop rapidly.

What would happen if the terminals of an ordinary household light
bulb were connected to the terminals of an automobile storage battery?
1. The bulb would light to its natural brilliance.
2. The bulb would not glow, though some current would flow through
   it.
3. The bulb would explode.
4. The battery would go dead in a few minutes.

Ability to Recommend Specific Appropriate Action

Which of these practices would probably contribute least to reliable
grades from essay examinations?
1. Weighting the items so that the student receives more credit for
   answering correctly more difficult items.
2. Advance preparation by the rater of a correct answer to each
   question.
3. Correction of one question at a time through all papers.
4. Concealment of student names from the rater.

"None of these" is an appropriate response for a multiple-choice test
item in cases where:
1. The number of possible responses is limited to two or three.
2. The responses provide absolutely correct or incorrect answers.
3. A large variety of possible responses might be given.
4. Guessing is apt to be a serious problem.

Ability to Make an Evaluative Judgment

Which one of the following sentences is most appropriately worded for
inclusion in an impartial report resulting from an investigation of a
wage policy in a certain locality?
1. The wages of the working people are fixed by the one businessman
   who is the only large employer in the locality.
2. Since one employer provides a livelihood for the entire population
   in the locality, he properly determines the wage policy for the
   locality.
3. Since one employer controls the labor market in the locality, his
   policy may not be challenged.
4. In this locality, where there is only one large employer of labor, the
   wage policy of this employer is really the wage policy of the
   locality.

Which of the following quotations has most of the characteristics of
conventional poetry?
1. "I never saw a purple cow;
   I never hope to see one."
2. "Announced by all the trumpets of the sky
   Arrives the snow and blasts his ramparts high."
3. "Thou art blind and confined
   While I am free for I can see."
4. "In purple prose his passions he betrayed
   For verse was difficult.
   Here he never strayed."






The predictive validity of college tests is low. That is, scores
derived from them do not predict future performance very well. For a
better understanding of predictive validity, considerable research
is needed to determine what magnitude of difference between
scores is significant. Often student X with a score of 91 will receive
an A, while student Y with 89 will receive a B. For GPA purposes on
most campuses these translate into 4.00 and 3.00, respectively. It is
assumed that student X can and will outperform student Y, but the
evidence that this is true is tenuous. How large must the difference
be between the two (1, 5, 10, 20 points or more) before the
predictive assumption is substantiated?
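A partial answer can be sketched from measurement error alone. The calculation below is an illustration under stated assumptions, not a finding from the research the paragraph calls for: it assumes both scores come from the same test, that their errors are independent, that the scores have a hypothetical standard deviation of 10 points, and that a difference of about two standard errors is wanted before two students are treated as different. It says nothing about future performance, which is a separate question.

```python
import math

def smallest_dependable_difference(score_sd, reliability, z=1.96):
    """Difference between two students' scores on the same test that exceeds
    what measurement error alone would commonly produce.

    The standard error of a difference between two independent scores is
    the standard error of measurement times sqrt(2).
    """
    sem = score_sd * math.sqrt(1.0 - reliability)
    return z * sem * math.sqrt(2.0)

# Hypothetical standard deviation of 10 points; .45 is Ebel's average figure
# and .90 stands for an unusually reliable classroom test.
for reliability in (0.45, 0.90):
    gap = smallest_dependable_difference(10, reliability)
    print(f"reliability {reliability:.2f}: about {gap:.0f} points")
```

On these assumptions the two points separating 91 from 89 fall well inside the error of measurement at either level of reliability.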

Why is precision emphasized? Because GPAs are used in an
exceptionally precise manner, as when arbitrary cut-off scores are
set. A 3.50 may entitle a student to further consideration for
admission to a program, while a 3.49 results in categorical rejection.
Under these circumstances the least that can be striven for is
accuracy in measurement.








Grading 



A comprehensive evaluation of student performance
should provide guidance for academic improvement
but students too often receive scant critical commentary
on their progress. Letter grading, the most commonly
accepted form of evaluation, is particularly susceptible to
the charge of insufficient feedback to the student. A more
fundamental grasp of the options for academic
measurement is the most direct route to improved grading.






Evaluation should mean providing a great deal of
information to students about their academic performance:
strengths, deficiencies and corrective steps to be taken, relative
standing, and other pertinent details. Blum has observed in this
connection: "It is no secret that students often receive little critical
commentary on their papers and examinations. The result is that
the prospects for academic improvement are diminished...."

There is this paucity of detailed help for students because
evaluation now tends to mean the assigning of letter symbols for
record-keeping purposes. The subject of grading is laden with prejudices,
dogmas, and unfounded opinions, and for many years it has tended
to provoke very unscholarly pronouncements. It is not a new
dilemma. In 1890, a Virginia institution had a six-point grading scale:
optimus, melior, bonus, malus, pejor, and pessimus. Because the
president thought too many mediocre students received the grade of
optimus, the scale was changed to a three-point one: distinguished,
approved, and disapproved. Soon, however, the president was
discontented again, for "some bad scholars were approved, and good
scholars were all distinguished" (Cureton).

The purpose in mentioning letter grading is to stimulate scholarly
attention to the subject. Such attention is imperative if progress is to
be made. Our discussion of the unresolved issues associated with
the assigning of grades is followed by some tentative suggestions for
improvement.

One reason some of the problems here are not yet being resolved
is the fact that several assumptions have not been examined except
by a few specialists. Another is the widespread and comfortable
belief inside and outside academe that letter grades have considerable
predictive validity. In truth, they do not. McClelland has
summarized data about the predictive value of grades.

I Ij^sr^vli'us! u<M>Ji tM(!uh»rtr{ ilt tfi,»f till V svxt«>jn.jtjijn\ h,<\<M|}-. 
sfimi' t "ir 



In a recent survey of studies about grades, Warren found that
about half of approximately 200 articles, papers, and reports that
appeared between 1965 and 1970 dealt with the form of grades (A, B,
C; P, F; etc.) and with grades as predictive measures. The other
half were concerned with a variety of aspects, such as presumed
advantages and disadvantages. Warren concluded: "These reports,
in spite of their variety, leave large gaps in our knowledge about
grades and grading.... These results do not constitute an impressive
advance in knowledge about an important, ubiquitous process in
higher education...."






Problems 

Single course grades are used to compare students within an
institution and across institutions. If measurements are the basis of a
comparison, no two of anything, let alone the learning of two people,
can be compared unless the same instrument is used for both
measurements. Woe be to the cabinetmaker who tries to assemble
pieces of rare and exotic wood some of which he has measured with
a giveaway yardstick and others with a finely calibrated meter
stick. For physical measurements, of course, there are many
agreed-upon scales or units: inch, yard, mile, ounce, pound, ton.
Each of these can be determined precisely so that two or more
measurements in the same units tend to have quite exact meaning. A
pound on the West Coast has the same meaning as a pound on the
East Coast. Perhaps the basic problem in grading students for
purposes of comparison is the absence of any such agreed-upon
measurements.

A second problem is inherent in the uncritical acceptance of
norm-referenced grading, or what students refer to as "grading on
the curve." This may have come into extensive use because of the
need to compensate for the lack of a measuring unit. At any rate,
norm-referenced grading derives from the mythical "normal curve
of distribution" or bell-shaped curve. Its pervasive and often
distorted applications have created an illusion of the existence of a
standard by which students can be compared equitably, first by the
professor who assigns the symbol and then by all others who see it.
In fact, the "normal curve" is nothing more than a mathematical
ideal or model. Moreover, according to Lindquist, there is an
erroneous belief that mental ability test data have been shown to form
the bell-shaped curve. The overlooked fallacy is that many
standardized tests are constructed deliberately so that the scores will
yield such a curve. In some cases foxy statisticians manipulate the
scores.

The potency of the false standard is illustrated by this episode
(Dressel):

At one university the decision was made to section engineering
students on the basis of previous grades. One professor, not ...
had received A's in all previous mathematics courses. Although
recognizing that this was an unusually good group, on the first
examination he came up with the usual distribution of grades from A to F.
A delegation of the students forced him to reconsider. The grades at the
end of the term included 40 percent A's, ... percent B's, and 10 percent
C's. Knowing the caliber of the students, the professor still could not
bring himself to report a distribution of grades in which almost every
student would be given an A.

This professor thought he had firm reference points for setting
cut-off scores for each grade.

It is bad enough when a lone professor grades on the curve for a
single class of highly capable students. It is even worse when a
gifted student body is judged in this manner. Reed College has
established grade guidelines for all faculty to follow (Levine and
Weingart). For freshmen the distribution is supposed to be A, 15 percent;
B, 15 percent; C, 40 percent; D, 10 percent. For the remaining three
categories of students, the recommended distribution is A, 15
percent; B, 45 percent; C, 35 percent; D, 5 percent. Needless to say,
such grading can cause talented students to encounter difficulties in
being admitted to graduate and professional schools. In the final
analysis, grading on the curve means statistical relativism; students
are rank-ordered from high to low.
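The relativism is easy to see when the procedure is written out. The sketch below is a minimal illustration of grading by quota, assuming the distribution quoted above for upperclassmen (15, 45, 35, and 5 percent); the function name, the rounding rule, and the class of twenty students, every one of whom scores 90 or better, are hypothetical.

```python
def grade_on_curve(scores, quotas):
    """Rank students from high to low and hand out letters by quota.

    `scores` maps student name to score; `quotas` lists (letter, percent)
    pairs from the highest grade down, with percents summing to 100.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    grades = {}
    start = 0
    cumulative = 0
    for letter, percent in quotas:
        cumulative += percent
        end = round(len(ranked) * cumulative / 100)
        for name in ranked[start:end]:
            grades[name] = letter
        start = end
    return grades

quotas = [("A", 15), ("B", 45), ("C", 35), ("D", 5)]
# Hypothetical class of 20 capable students with scores from 90.0 to 99.5.
scores = {f"student{i}": 90 + 0.5 * i for i in range(20)}
grades = grade_on_curve(scores, quotas)
print(grades["student19"], grades["student10"], grades["student0"])
```

The student with a score of 90 receives the D because someone must; the letter reports rank within this class and nothing about what was learned.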

Grade point averages are also used to compare students within
an institution and across institutions. Basic errors in testing and
grading are compounded by the numerous ways in which GPAs are
computed at different institutions. In one survey of these practices
(Collins and Nickel) from a sample composed of 650 public and
private two- and four-year institutions in the 50 states and the District
of Columbia, with 448 schools responding, great variation was found
(see table below).

The survey revealed that in some schools such grades as
Incompletes immediately become F's for calculation purposes, while in
others more than an entire term can elapse before such academic
capital punishment is applied. As one example of "sudden death,"
during experimental investigation of instruction at the University of
Texas at Austin (Stice) it was necessary for students to receive
Incompletes if they desired. During one term 26 percent did so. None
of the investigators knew of the policy that I's became F's for GPA
purposes nor did several staff members in the registrar's office.
Several good students lost scholarships and others failed to receive
invitations to honor societies. More than likely the calculation
practices are not specified on very many transcripts.

The assumption that single grades have common reference points
has been made about GPAs, too. Who knows what sorts of tests are
behind the grades or the standards by which the grades were
derived? If anything, GPA statistics as they are presently employed
tend to be meaningless, despite what most academicians and
others think.
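How much the computing practice matters can be shown with one transcript. The sketch below is an illustration only: the four-point values are the conventional ones, every course is assumed to carry equal credit, and the transcript is invented. It applies two of the practices the survey found in use, counting all grades or only the last grade for a repeated course, together with the two treatments of an Incomplete.

```python
POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}

def gpa(records, repeat_policy="all", incomplete_as_f=False):
    """Grade point average for (course, grade) pairs in chronological order.

    repeat_policy "all" counts every attempt at a course; "last" counts only
    the most recent one. An "I" is dropped unless incomplete_as_f is set,
    in which case it is counted as an F.
    """
    counted = []
    for course, grade in records:
        if grade == "I":
            if not incomplete_as_f:
                continue
            grade = "F"
        counted.append((course, grade))
    if repeat_policy == "last":
        counted = list(dict(counted).items())  # later attempts overwrite earlier
    return sum(POINTS[grade] for _, grade in counted) / len(counted)

# Invented transcript: one repeated course and one Incomplete.
transcript = [("Math", "D"), ("Math", "B"), ("History", "A"), ("Physics", "I")]
for policy in ("all", "last"):
    for as_f in (False, True):
        print(policy, as_f, round(gpa(transcript, policy, as_f), 2))
```

The same record yields averages from 2.00 to 3.50 depending on the institution's rules, which is a wider gap than the cut-off scores discussed earlier are expected to resolve.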

                                              Number of Institutions
                                              Indicating This Is
Practice                                      Present Practice

All grades received in all courses            (not legible)
taken at any institution are used
in computing the overall grade
point average

Only grades in courses which                  43
count for the degree are used in
computing the GPA

Only grades in courses taken in               246
the institution doing the computing
are used in computing the GPA

When a course is repeated, all                266
grades (two or more) are used
when computing GPA

When a course is repeated, only               (not legible)
the last grade received is used
when computing the GPA

Our numerous deliberations about grading led repeatedly back to
several basic facts: (1) Unidimensional symbols report
multidimensional phenomena. A given grade can reflect level of knowledge,
attitudes, procrastination, interest or lack of it, and other factors. The
lone symbol specifies none of these things. Perhaps each professor
assumes that every other interpreter will see in the lone symbol all
of the nuances he or she intended. (2) The symbol, by itself, reveals
nothing about the quality of the test or tests through which it has
been derived.



Suggestions 

An emerging model of grading is called criterion-referenced. Its
basic feature is the concept of mastery. If anything,
criterion-referenced grading requires more complete statements of objectives than
does norm-referenced grading. Tests are designed, then, to
determine whether a student has or has not attained these objectives.
The concept of criterion-referenced grading has been used especially
in the Keller Plan (see Ruskin and Hess) and in contract grading.
(While this approach appears to be more and more common, there
is little about it in the literature.) There are several excellent
references for criterion-referenced grading: Popham, Carver, and
Angoff.

Criterion-referenced grading is used in the emerging
competency-based curricula. For a digest of its important features in this
context (as well as answers to questions that are being asked, such as,
What is competence? and How does the faculty role change in a
competence curriculum?), see the report by the Southern Regional
Education Board.
This method of grading certainly has its place, especially in
professional curricula. When it is used for a given course, a notation
should be made on the transcript to facilitate interpretation.
Finally, there is the import for grading of the basic theme of this
volume: improved testing or measurement is the fundamental route
to improved grading. There are no substitutes for clarity about what
one is trying to accomplish in instruction and very careful efforts to
find out what students have achieved.

Etzioni recently suggested that what is needed is open discussion
by departments leading to agreement about grading standards, but
this would be insufficient. Once again the tip of the iceberg would be
considered while its submerged body would be ignored. A better
solution would be open discussions by departments about all facets of
testing. A professor can no longer go it alone in certifying students
for society.







Lone Efforts Are Not Enough 



Growing external pressures are forcing faculty to take a
fresh look at student evaluation. The new consumerism,
recent legal decisions, and far-reaching social criticism
will no longer leave matters of grading and testing to the
private academic preserve. The use of external examiners
and the establishment of effective campus grievance
arrangements are only two of the ways recommended to
improve an increasingly nettlesome issue in academic
life.






If assessment is not improved from inside the profession,
then it most surely will be put under pressure from the outside.
Traditionally faculty members have enjoyed almost complete
autonomy in their teaching performance. Until recently the courts had
tended to avoid the academic bastions. But now they are beginning
to intervene, and some observers believe such intervention will soon
accelerate. This has resulted from several trends: an increased
sophistication of students, a new regard for higher education as a
social necessity and an individual right, the expansion of civil rights
protections by public authority, and, perhaps most important, the
new age of majority.



The Courts Intervene 

One instance of recent court intervention dealt with a lone grade
(State ex rel. Bartlett v. Pantzer). A political science student
graduated from the University of Chicago in June 1971 with a Bachelor of
Arts degree. During the spring quarter of his senior year he had
enrolled in a graduate accounting course to fulfill an admission
requirement of the law school of the University of Montana, where he
was seeking admission in September 1971. The law school had
informed the student that the requirement would be fulfilled if he
received a satisfactory grade.

The student received a D in the course, whereupon he was
advised by the law school that he would not be admitted because the
grade was not a satisfactory one. Testimony in court revealed that
colleges and universities regarded a grade of D as "acceptable,"
but not "satisfactory." The Supreme Court of Montana was unable
to discern the exquisite difference and directed the law school to
admit the student.

More court intervention in matters of academic measurement
seems likely in the not-too-distant future. The United States Supreme
Court made a momentous decision in the Griggs v. Duke Power
Company case and may have set a precedent for drastically altered
interpretations of higher education test scores and grade point
averages. The company was found to have discriminated racially by
requiring, for an employee to be promoted from laborer to coal
handler, either the possession of a high school diploma or the passing of
two standardized tests. In rendering its decision, the court ruled:
"Nothing in the act (Civil Rights Act, 1964, Title VII) precludes the
use of testing or measuring procedures; obviously they are useful.
What Congress has forbidden is giving these devices and
mechanisms controlling force unless they are demonstrably a reasonable
measure of job performance."

Suits have been instituted already in several states charging that
bar examinations discriminate unfairly against minority groups.
The fundamental issue is the predictive validity of such tests for all
who take them. It could well be that these assaults upon bar exams
are a prelude to assaults on many other licensing examinations,
because they, too, are job related. Since higher education in its testing
activities is engaged more in credentialing or rank-ordering
students than in assessing learning, it is not too difficult to foresee
grade point averages being ruled job-related by the courts. (Today a
student may be refused admission to a professional school because
of a GPA a few hundredths of a point below some arbitrary cut-off
score.) Many ramifications of the Duke Power Company decision
and its innumerable complexities have been examined meticulously
and thoughtfully by Huff.

Of more direct portent for the future may be the dissenting
opinion of former Justice William O. Douglas in DeFunis v. Odegaard
(Fields). Justice Douglas was especially critical of scores derived
from the Law School Admissions Test and of grade point averages
and the fact that they had dominated the selection process. He
argued that law schools are not bound to admit students according to
mechanical criteria because such criteria often conceal important
abilities. Justice Douglas was most persuasive in his plea for more
thorough assessment of individual attributes than test scores
provide. For example, he maintained that a person who pulls himself
from the ghetto via a community college has demonstrated a quality
of perseverance and thereby has more promise for the study of law
than a rich graduate of Harvard. The poorer applicant should be
admitted, said Douglas, because he had shown special potential in
contrast to the Harvard graduate who may have taken less
advantage of the vastly superior opportunities afforded him.

It is too soon to know the full impact of the so-called Buckley
Amendment that gives students access to their test papers and
other official records, but scores of students may avail themselves
of the access and be so overwhelmed that they will demand careful
and honest explanations for selected test scores and grades. This
provision of law may give them a basis for court action to enforce
their demands. Quite obviously, poor tests and unfair grades are
features of instruction that are under the direct control of each
individual faculty member. Just how could a student's "improper spirit
toward the subject matter" (Case 9, page 15) be documented or
substantiated in court?






What does all this mean? Unless professors individually and
collectively begin to make drastic improvements in testing and grading
practices, there will be intrusions on their autonomy from without in
several forms. There even appears to be a possibility of compulsory
state or nationwide standardized tests of academic achievement.
Academic freedom is imperative and must be preserved, but the
professoriate cannot avoid its own responsibilities. Grading policies
and practices in most undergraduate courses do not bear any
relation to inviolable academic freedom.

How, then, can the process be improved? Classroom tests can be
improved by faculty members learning more about measurement
and obtaining the assistance of their colleagues. At least three
additional reforms must be implemented to improve the test product
and demonstrate the professoriate's willingness to put its house in
order.

Visiting Examiners 

It is a deeply ingrained belief throughout American higher education
that instructing and examining are inseparable. The instructor
is supposedly the person best able to judge the work of his or her
student.

There has been at least one historical challenge to this assumption
(Coulter). In 18* I the three trustees of the University of Georgia
were named as visitors and urged, along with other distinguished
men of the state, to attend examinations of seniors because: "'The
test of the pudding is the taste thereof' is a saw honored with age
and truth. Examination times were tasting times and this tasting
should be done by more than the cooks only." By 1825 the
examinations for juniors were being attended by any person who desired
to attend.

A modern and refined counterpart to this practice of some 150
years ago is the visiting examiners tradition for the Honors Program
of Swarthmore College (Swarthmore College Faculty, 1941), which
began in the early 1920s, continues to flourish today, and is widely
acclaimed by faculty, students, and alumni.

Around 40 percent of juniors and seniors elect to take honors
work. Normally this means that a student studies six subjects during
the last two years. The work is pursued independently or in small
seminars. At the end of the senior year the student is subjected to a
three-hour written examination in each subject. These exams are
prepared and evaluated by faculty members from other institutions.
In the oral examinations that follow, there is no rigid pattern; they
are conducted in a variety of ways. But the judgment of the visitor
carries the most weight.

A recent evaluation of the program (Swarthmore College, 1967)
describes the rationale and the benefits succinctly:



Most external examiners ... think the system works well and the
examiners' evaluations of students are generally consistent with the
faculty's. Many graduates of honors have said (in the alumni
questionnaire), as have many faculty, that the system helps to create an
atmosphere of faculty-student collaboration... These are now
conventional statements, but we are inclined to agree with them. The
colleagueship and the intellectual checks provided by external
examiners are widely felt to be valuable for both students and the
faculty; many of the latter, especially, set store by it....

On all too many campuses faculty and students are two factions
warring over learning. The faculty are so dedicated to the exercise
of their selective function, they cannot see teaching-learning as a
collaborative endeavor, whereas at Swarthmore apparently faculty
members and students work together to meet and impress a sort of
common foe, the visiting examiner. Thus one reason for more
extensive use of this type of program is that it serves the cause of
learning for the individual students who participate.

A second reason for having visiting examiners on many campuses
is that their presence should broaden the perspectives of faculties
about the art and techniques of teaching. While the various faculty
organizations help keep the professoriate abreast of disciplinary
developments, many pay little direct attention to good teaching.
Without sufficient stimulation it is very easy to become smug, myopic,
and provincial. If, over a substantial period of time, too many
students performed poorly, the visiting examiners would be in a
position to ask some penetrating questions of the home faculty. Help by
colleagues from other institutions is more useful and more palatable
than interference from those outside academic life.



Testing Specialists



Another challenge to the notion that teaching and testing are
inseparable came during the early 1930s at the University of Chicago with
the creation of the Board of Examinations (Bloom). The faculty were






concerned primarily with having students assume responsibility for
their own learning. Degree requirements were set in terms of
comprehensive examinations, and as a result students could make
individual decisions about the speed with which they would attain their
degrees as well as about their study methods and class attendance.

Since the comprehensive examinations were the sole basis for
meeting graduation requirements, they had to be excellent
measures of academic achievement. In consultation with faculties, a
corps of test specialists constructed the exams, scored them, and
assigned grades. The faculty believed that an ideal teacher-student
relationship, one which promoted an optimum of learning, was
impossible when the teacher also served as judge and jury. The
success of the project was revealed, in part, by the high test
reliability coefficients that were obtained. These ranged almost without
exception between .90 and .95.
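
For readers unfamiliar with reliability coefficients, the sketch below
shows one common way such a figure can be computed for a test whose
items are scored right or wrong: Kuder-Richardson formula 20. It is an
illustration only. The text does not say which formula the Chicago
examiners used, and the function name kr20 and the small table of
responses are hypothetical.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson formula 20 for items scored 1 (right) or 0 (wrong).

    responses: one list per student, holding one 0/1 entry per item.
    Uses the population variance of total scores (an assumption; some
    texts use the sample variance instead).
    """
    n_students = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    total_variance = pvariance(totals)

    # Sum of p * q over items, where p is the proportion answering correctly.
    pq_sum = 0.0
    for i in range(n_items):
        p = sum(row[i] for row in responses) / n_students
        pq_sum += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - pq_sum / total_variance)


# Hypothetical responses of five students to a four-item test.
sample = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

print(round(kr20(sample), 2))
```

A coefficient near 1 means that students are ranked in much the same
order by every part of the test; a four-item toy example like this one
cannot be expected to reach the .90 to .95 range reported above.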
Several forces combined during the early 1950s to eliminate this
extreme departure from traditional testing and grading practices.
In the meantime several campuses have established offices that
serve instructors on a voluntary or request basis. One example is
the Evaluation and Examination Service of the University of Iowa
(Whitney). The service staff consults with individual faculty
members or departments on techniques of test construction and
improvement, test and item analysis, and methods of grade assignment. In
addition, course examinations are duplicated, scored, and analyzed.
The service keeps the faculty and others informed periodically by
means of memos and technical bulletins. A current memo is entitled,
"Should I Take the Graduate Record (GRE) Again?" Recent bulletins
discussed "Improving Essay Questions." There are two professional
members of the staff; about 40 percent of a faculty of 700 use the
service. Comparable agencies should be available to faculties on all
campuses.
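
The kind of "test and item analysis" such a service performs can be
sketched as follows. This is a minimal illustration, not a description
of the Iowa procedures: the function name item_analysis, the sample
responses, and the use of upper and lower 27 percent groups (a common
convention) are all assumptions.

```python
def item_analysis(responses, group_fraction=0.27):
    """Return (difficulty, discrimination) for each item of a 0/1-scored test.

    difficulty: proportion of all students answering the item correctly.
    discrimination: proportion correct in the top-scoring group minus the
    proportion correct in the bottom-scoring group.
    """
    n_students = len(responses)
    n_items = len(responses[0])

    # Rank students by total score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)
    group_size = max(1, round(n_students * group_fraction))
    upper = ranked[:group_size]
    lower = ranked[-group_size:]

    results = []
    for i in range(n_items):
        difficulty = sum(row[i] for row in responses) / n_students
        p_upper = sum(row[i] for row in upper) / group_size
        p_lower = sum(row[i] for row in lower) / group_size
        results.append((difficulty, p_upper - p_lower))
    return results


# Hypothetical responses of five students to a four-item test.
sample = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

for number, (difficulty, discrimination) in enumerate(item_analysis(sample), 1):
    print(number, round(difficulty, 2), round(discrimination, 2))
```

An item that nearly everyone answers correctly, or one on which weak
students do as well as strong ones, is a candidate for revision before
the test is used again.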

In Change's first faculty policy paper on professional development,
the authors, in a chapter entitled "Evaluation for What?",
suggest the ideal of the separation of teacher from evaluator: "A
developmental approach to education calls for a new kind of
detachment: for students, the detachment of the process of learning from
the certification of competence; and for teachers, detachment of
efforts to improve teaching from official assessments of performance"
(Group for Human Development in Higher Education).



Academic Grievances Committees 

Tradition has it that if a student feels a grade is an improper one, he
or she may seek redress by consulting the individual faculty
member. If satisfaction is not received, the student has had the right
to consult with other individuals: department heads, deans, and
even the president or chancellor. For the most part the
arrangements have been informal and final authority to change or not
change the grade has rested with the faculty member.

In 1967 several important organizations* issued a Joint Statement
on Rights and Freedoms of Students. The statement included this
right, "Protection Against Improper Academic Evaluation. Students
should have protection through orderly procedures against
prejudiced or capricious academic evaluation. At the same time,
they are responsible for maintaining standards of academic
performance established for each course in which they are enrolled."

Recently, perhaps partly as a result of the joint statement,
grievance procedures have been made formal in some institutions and
often include a specially appointed committee, which in some cases is
given the authority to overrule a faculty member and change a
grade. For example, at California State University, Los Angeles, if a
grade grievance is not resolved at the departmental level, the
student may appeal to the dean of that school who, in turn, refers the
matter to a special committee. The dean, after consultation with the
committee, may authorize a change of grade. If for any reason a
student believes the problem has not been resolved fairly, he or she
may submit a signed statement to the standing student grievance
committee, which may refer the issue to one of several other
committees, any one of which may recommend a grade change to the
appropriate dean, whereupon the change is made in the permanent
records.

*American Association of University Professors, U.S. National Student Association, Association of

At Western Michigan University, the arrangements are less
complicated. If a student is dissatisfied following informal consultation
within the department, he or she may see an administrator, who
may decide the grievance is unwarranted or there is sufficient
evidence for the case to be considered by a committee on academic
fairness, either the graduate or the undergraduate committee. The
undergraduate committee consists of three faculty members, three
undergraduates, and a nonvoting chairperson. If the committee
decides to recommend a change of grade, the faculty member is
informed first so that he or she may make the change. If the faculty
member prefers not to do so, the committee then makes the change
by notifying the dean of records and admissions.

At Pomona College, the procedures are simple and
straightforward. If, after the usual informal hearings, the disputants are still
disgruntled, the dean appoints a small ad hoc committee of faculty
from the department of the instructor or from a related department.
"The decision of the hearing committee on the disputed grade shall
be final."

There are formal hearing procedures in other institutions, but in
these the final judge, whether a committee, a dean, or a
chancellor, has no power to change a grade. Appeals for fairness can be
addressed to the faculty member, but not a decision that a grade
must be changed. After going to elaborate lengths to ensure
academic rights for students, Michigan State University (1969)
persistently maintains the traditional stance that the instructor is the only
person who can assign a grade. In most instances instructors are
cooperative, but nothing further can be done if they stubbornly defy
the grievance committee, according to an official.

We recommend that formal arrangements be established for
reconciling testing and grading grievances and that a final judge other
than the instructor have the authority to change a grade. This
recommendation is made for these reasons:

(1) Cases such as some of those mentioned in the first chapter
reflect almost unbelievable examples of faculty arbitrariness and
capriciousness. Students should be able to fight back against such
unfairness, and with the balance of power on their side. This
presumes our basic system of justice, which is designed to protect the
rights of the weak individual who is being persecuted by strong
external authorities.

(2) The mere existence of such appeal arrangements should help 
decrease testing and grading offenses. 

(3) Correction by one's peers is both more palatable and more ef- 
fective than intrusion by outside forces. 







For Further Reading 



For a more comprehensive understanding of testing and 
evaluation, faculty have access to a number of excellent 
source documents. Here are some of the best. 






References 



Adkins, Dorothy Wood. Test Construction: Development and
Interpretation of Achievement Tests. Columbus: Charles Merrill, 1960.

Angoff, William H. "Criterion-Referencing, Norm-Referencing, and
the SAT." College Board Review, no. 92 (1974), p. -x

Bloom, Benjamin S. "Changing Conception of Examining at the
University of Chicago." In Evaluation in General Education, edited
by Paul L. Dressel. Dubuque, Iowa: William C. Brown Co., 1954.

Blum, Paul Von. "Needed: A Code of Ethics for Teachers." Chronicle
of Higher Education, October 21, 1974, p. 20.

California State University. "Student Information." Unpublished.
Los Angeles.

Carver, Ronald P. "Two Dimensions of Testing: Psychometric and
Edumetric." American Psychologist 29 (1974): 512-518.

Collins, Janet E. and Nickel, K. N. "Grading, Recording and
Averaging Practices in Higher Education." Mimeographed. Wichita,
Kansas: Wichita State University, 1974.

Coulter, E. M. College Life in the Old South. 2d ed. Athens: The
University of Georgia Press, 1951.

Cureton, Louise W. "The History of Grading Practices." Measurement
in Education, no. 4 (1971), pp. 1-8.

DeFunis v. Odegaard, 416 U.S. 312, 94 S. Ct. 1704 (1974).

Dressel, Paul L. Evaluation in Higher Education. Boston: Houghton
Mifflin, 1961.



Ebel, Robert L. Essentials of Educational Measurement. Englewood
Cliffs, N.J.: Prentice-Hall, 1972.

Ebel, Robert L. "Writing the Test Item." In Educational
Measurement, edited by E. F. Lindquist. Washington, D.C.: American
Council on Education, 1966.

Etzioni, Amitai. "Grade Inflation: Neither Freedom nor Discipline."
Human Behavior, October 1975, p. 11.

Felker, Daniel B., and Dapra, Richard A. "Effects of Question Type
and Question Placement on Problem-Solving Ability from Prose
Material." Journal of Educational Psychology 67 (1975): 380-384.

Fields, Cheryl M. Chronicle of Higher Education, April 29, 1973,
p. 1.

Griggs v. Duke Power Company, 401 U.S. 424 (1971).

Group for Human Development in Higher Education. Faculty
Development in a Time of Retrenchment. New Rochelle, N.Y.: Change
Magazine, 1974.

Hechinger, Fred M. "An Academic Counter-Revolution." Saturday
Review/World, no. 131 (1974), pp. 63-68.

Hofstadter, R. Anti-Intellectualism in American Life. New York:
Vintage Books, 1966.

Huff, Sheila. "Credentialing by Tests or by Degrees: Title VII of
the Civil Rights Act and Griggs v. Duke Power Company." Harvard
Educational Review, no. 2 (1974).

Levine, Arthur and Weingart, John. Reform of Undergraduate
Education. San Francisco: Jossey-Bass, 1973.

Lindquist, E. F. A First Course in Statistics: Their Use and
Interpretation in Education and Psychology. Boston: Houghton
Mifflin, 1942.






i^ii'irni Jnsfrii, ti,,n,i! Infy nf 

MmIu.iI 1 /Jut atiMfi ' Vrruttiiif]^^ 
(>!«i and N» u l>p«'s of I xjiiima 

VnphAfu \\ Limes, mI ( 'r(t, riyn lif fi r 
i*r.,i:r,j:j,, ji,f (\.un^* (ifii^ awl 1\ tthta 



Southt rji li4 i^\^>n.i\ IviuiaUon Hoard 
J * jriufij' ^ <»iif ( H( - lU^^.^ti^i! 

Stalnaker, John M. "The Essay Type of Examination." In Educational
Measurement, edited by E. F. Lindquist.

l'S|.rro|f(t .it flu I n!\irsitvs>t T»Ads 
'ii '\nstm /*S/ sA'f/rr, no >i 

Swarthmore College. "Critique of a College." Swarthmore, Pa., 1967.

Swarthmore College Faculty. An Adventure in Education. New York:
Macmillan, 1941.

Thomas, L. and Augstein, S. An Experimental Approach to Learning
From Written Material. Uxbridge, England: Centre for the Study of
Human Learning, Brunel University, 1970.

Afx Own wu Washmi^ton. 

KHK .rhMringhoust. <»ri Hi^'hor Kdu- 

ralion, 

Western Michigan University. "Student Academic Rights: Policies and
Procedures." Unpublished. Kalamazoo.

>fasPH'« ,x, Na/ma and liarkor. Donald 
if \ llalt (Vntury of lit-v.-aroh on 
I ssai T<*slin^ " Im proline ('(ylUfH' 
ami I fur,-r^if\ T' nhifi^,no I ( 197:J| 






Suggested Readings 



American Psychological Association. Standards for Educational and
Psychological Tests. Washington, D.C.: American Psychological
Association, 1974.
This monograph was developed by a joint committee of members from the
American Psychological Association, the American Educational Research
Association, and the National Council on Measurement in Education. The
contents are directed to both developers and users of standardized
tests. "Essential," "very desirable," and "desirable" considerations
about tests are proposed.

Anderson, Scarvia; Ball, Samuel; Murphy, Richard T.; and Associates.
Encyclopedia of Educational Evaluation. San Francisco: Jossey-Bass,
1975.
This is one of the first detailed reference works on concepts and
techniques for evaluating education and training programs. It is not
limited in scope to colleges and universities. The articles,
alphabetically arranged from "accountability" to "variance," are
written by specialists. Each article is extensively cross referenced
and is followed by selected sources. The articles cover 11 topics:
evaluation models; functions and targets of evaluation; program
objectives and standards; social context of evaluation; planning and
design; systems technologies; variables; measurement approaches and
types; technical measurement considerations; reactive concerns;
analysis and interpretation.

Bowen, Howard R., ed. New Directions for Institutional Research:
Evaluating Institutions for Accountability. No. 1. San Francisco:
Jossey-Bass, Spring
As the title implies, this booklet is about program evaluation. The
seven papers, prepared especially for this volume by six authorities,
deal with the various complexities of assessment and offer
suggestions for resolving them.

Bruning, J. L. and Kintz, B. L. Computational Handbook of Statistics.
Glenview, Ill.: Scott, Foresman and Co., 1968.
This is an excellent "cookbook" of statistical methods, clear and
concise in its presentation of the steps necessary to compute the
basic measurement statistics mentioned in The Testing and Grading of
Students.

Buros, Oscar K., ed. The Seventh Mental Measurements Yearbook.
Highland Park, N.J.: Gryphon Press, 1972.
This work is in two volumes that have a total of slightly more than
2,000 pages. More than 1,100 published tests (achievement, attitude,
personality, and others) are listed, along with some 12,000
references. For approximately half of the tests, there are original
reviews. These critiques are indispensable when selecting a
standardized test.

Dressel, Paul L. and Associates. Evaluation in Higher Education.
Boston: Houghton Mifflin, 1961.
This is one of the few books in this field written directly to college
and university faculty members. Thus the level of discourse is more
appropriate than that in many others. The theories and examples of
test questions tend to be quite practical. Of the chapters, all
written by different authorities, 10 deal explicitly with the issues
discussed in The Testing and Grading of Students. Four of them are
especially pertinent: evaluation in the social sciences, evaluation in
the natural sciences, evaluation in the humanities, and evaluation of
communication skills.

Ebel, Robert L. Essentials of Educational Measurement. Englewood
Cliffs, N.J.: Prentice-Hall, 1972.
This book is a revised version of the author's 1965 Measuring
Educational Achievement. It is sound and readable, and it is referred
to repeatedly throughout the first three chapters of the present work;
many points only touched on here are clearly elaborated therein. The
chapters are separated into five categories: Part I, History and
Philosophy; Part II, Classroom Test Development; Part III, Getting,
Interpreting, and Using Test Scores; Part IV, Test Analysis and
Evaluation; Part V, Published Tests and Testing Programs. There is a
glossary of the terms and concepts used in educational measurement.

Lindquist, E. F., ed. Educational Measurement. Washington, D.C.:
American Council on Education, 1951.
This useful book, which went into its sixth printing in 1966, is a
comprehensive handbook and textbook on the theory and technique of
educational measurement. All 18 articles were especially prepared for
the volume by noted authorities. Many of the selections, which are
grouped into three categories (The Functions of Measurement in
Education, The Construction of Achievement Tests, and Measurement
Theory), are of a very practical nature, and all instructors can find
good tips here for testing.

Mager, Robert F. Goal Analysis. Belmont, Ca.: Fearon, 1972.
Mager's work merits considerable attention. His writing is clear and
easily understood; he comfortably translates his theory into
application. Goal Analysis is a small book, easy to digest, that
spells out the steps by which instructors can identify goals in their
instruction and establish the appropriate steps toward the successful
completion of those goals. Assessment and evaluation are both built
into the goal-analysis procedure. The book defines procedures that
allow instructors to say where they are, where they want to go, how
they intend to get there, and how they know when they are there.

Mager, Robert F. Measuring Instructional Intent. Belmont, Ca.: Fearon,
1973.
Writing in his unique, informal style, the author describes and
illustrates a procedure that will help in selecting or creating test
items that will match objectives. Illustrations cover a wide array of
performances.

Mager, Robert F. Preparing Instructional Objectives. 2d ed. Belmont,
Ca.: Fearon, 1975.
While the contents of this book seem deceptively simple, the substance
is profound, especially for those who have given almost no thought to
objectives. The book is cleverly and wittily written. Beginners in the
academic enterprise will benefit greatly; old-timers might

Mehrens, William A. and Ebel, Robert L., eds. Principles of
Educational and Psychological Measurement: A Book of Selected
Readings. Chicago: Rand McNally, 1967.
This book contains classical articles on measurements, most of them
very technical and statistical, which were published over a span of
years. The selections are grouped into five categories: measurement
theory and scaling, norms, reliability, validity, item analysis and
selection.

Pace, C. Robert, ed. New Directions for Higher Education: Evaluating
Learning and Teaching. No. 4. San Francisco: Jossey-Bass, 1973.
Each chapter was prepared especially for this booklet by authors with
widely varying perspectives. The six papers collectively demonstrate
how complex problems of evaluation are and the innumerable factors to
be considered and are useful as a quick but

Thorndike, Robert L., ed. Educational Measurement. 2d ed. Washington,
D.C.: American Council on Education, 1971.
The first edition of this book went through seven printings. This
second edition, prepared with the assistance of the American
Educational Research Association and the American Council on
Education, reflects the broadened concern about evaluation which has
been developing. The 20 articles are addressed to four areas: Part
One, Test Design, Construction, Administration, and Processing; Part
Two, Special Types of Tests; Part Three, Measurement Theory; Part
Four, Application of Tests to Educational Problems. Both the
specialist and the novice will find this book useful.



Selected Journals
with Special Emphasis on Evaluation

American Educational Research Journal
British Journal of Statistical and Mathematical Psychology
Center for the Study of Evaluation
College Student Journal
Educational and Psychological Measurement
Journal of Educational Measurement
Journal of Research in Science Teaching
Programmed Learning and Educational Technology
Psychometrika
Review of Educational Research






